
Web Collection 2012

Web crawl data from the year 2012. Some of this data is currently not publicly accessible.

193,036 results


Part of: Web Collections

Media Type
web: 193,036

Topics & Subjects
crawldata: 152,948
wiki: 3,505
dumps: 3,497
incremental: 3,494
media: 3,494
tape: 3,494

Creator
internet archive: 140,981
archive-it: 22,003
thumper2.php: 11,966
wikimedia projects editors: 3,515
portuguese web archive: 273
www.engadget.com: 17

Language
English: 3,037
Portuguese: 275
Italian: 26
Anyhub: 19
Swedish: 13
German: 9

Wide Crawl started January 2012
web
eye 4.2M
favorite 0
comment 0
Internet Archive crawldata from Webwide Crawl, captured by crawl423.us.archive.org:wide from Tue Jan 17 08:02:53 PST 2012 to Tue Jan 17 01:16:20 PST 2012.
Topic: crawldata
vkontakte.ru
web
eye 2M
favorite 0
comment 0
Source: vkontakte.ru
Wide Crawl started January 2012
web
eye 1.6M
favorite 0
comment 0
Internet Archive crawldata from Webwide Crawl, captured by crawl413.us.archive.org:wide from Sat Jan 21 04:01:50 PST 2012 to Fri Jan 20 21:01:34 PST 2012.
Topic: crawldata
Live Web Proxy Crawls
web
eye 963,994
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-gen1.us.archive.org from 2012-07-20T21:59:15 UTC to 2012-07-21T06:44:52 UTC.
Topic: crawldata
Youtube Videos
web
eye 875,484
favorite 0
comment 0
Internet Archive crawldata from YouTube Video archiving project 2011, captured by crawl440.us.archive.org:youtube from Sat Jul 21 05:39:04 PDT 2012 to Fri Jul 20 23:46:43 PDT 2012.
Topic: crawldata
recurrence=WEEKLY, maxDuration=259200, maxDocumentCount=null, isTestCrawl=false, seedCount=3, accountId=575, organizationName="Chicago-Kent College of Law", collectionId=2817, collectionName="Chicago-Kent College of Law"
recurrence=NONE, maxDuration=259200, maxDocumentCount=null, isTestCrawl=false, seedCount=12, accountId=593, organizationName="Government Printing Office", collectionId=3142, collectionName="CFPB"
Japan Earthquake
web
eye 835,419
favorite 0
comment 0
recurrence=NONE, maxDuration=259200, maxDocumentCount=null, isTestCrawl=false, seedCount=507, accountId=156, organizationName="Virginia Tech: Crisis, Tragedy, and Recovery Network", collectionId=2438, collectionName="Japan Earthquake"
Live Web Proxy Crawls
web
eye 805,301
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBackMachine, captured by wwwb-gen1.us.archive.org:wbm from Tue Jan 3 12:37:06 PST 2012 to Tue Jan 3 06:31:55 PST 2012.
Topic: crawldata
The Archive Team Just In Time Grabs
web
eye 784,906
favorite 1
comment 0
Election Crawl 2012
web
eye 608,976
favorite 0
comment 0
Internet Archive crawldata uploaded by crawling119.us.archive.org:COL-ELECTION2012 from Fri Jun 29 21:24:09 PDT 2012 to Thu Jan 10 20:37:50 PST 2013.
Topic: crawldata
Wikipedia Outlinks February 2012
web
eye 602,938
favorite 0
comment 0
Internet Archive crawldata from Wikipedia outbound links, captured by crawl435.us.archive.org:wpo from Thu Mar 1 20:56:37 PST 2012 to Thu Mar 1 14:19:48 PST 2012.
Topic: crawldata
Live Web Proxy Crawls
web
eye 542,906
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBackMachine, captured by wwwb-gen1.us.archive.org:wbm from Sat May 12 21:38:20 PDT 2012 to Sat May 12 18:43:53 PDT 2012.
Topic: crawldata
Wide Crawl started September 2012
web
eye 529,237
favorite 0
comment 0
Internet Archive crawldata from Webwide Crawl, captured by crawl337.us.archive.org:wide from Wed Oct 17 08:14:47 PDT 2012 to Wed Oct 17 02:41:59 PDT 2012.
Topic: crawldata
Alexa Crawls
by thumper2.php
web
eye 482,171
favorite 0
comment 0
Alexa crawl
Topic: crawldata
Wikipedia Outlinks February 2012
web
eye 457,268
favorite 0
comment 0
Internet Archive crawldata from Wikipedia outbound links, captured by crawl435.us.archive.org:wpo from Thu Oct 18 19:09:44 PDT 2012 to Thu Oct 18 13:30:19 PDT 2012.
Topic: crawldata
Wide Crawl started January 2012
web
eye 440,828
favorite 0
comment 0
Internet Archive crawldata from Webwide Crawl, captured by crawl420.us.archive.org:wide from Wed Jan 11 15:00:58 PST 2012 to Wed Jan 11 07:47:03 PST 2012.
Topic: crawldata
Wide Crawl started September 2012
web
eye 435,073
favorite 0
comment 0
Internet Archive crawldata from Webwide Crawl, captured by crawl339.us.archive.org:wide from Fri Oct 19 01:17:49 PDT 2012 to Thu Oct 18 21:38:24 PDT 2012.
Topic: crawldata
Wide Crawl started January 2012
web
eye 433,662
favorite 0
comment 0
Internet Archive crawldata from Webwide Crawl, captured by crawl413.us.archive.org:wide from Thu Jan 5 18:18:33 PST 2012 to Thu Jan 5 11:30:47 PST 2012.
Topic: crawldata
Wide Crawl started January 2012
web
eye 421,016
favorite 0
comment 0
Internet Archive crawldata from Webwide Crawl, captured by crawl413.us.archive.org:wide from Thu Jan 5 20:28:12 PST 2012 to Thu Jan 5 13:25:29 PST 2012.
Topic: crawldata
Live Web Proxy Crawls
web
eye 411,319
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBackMachine, captured by wwwb-gen1.us.archive.org:wbm from Wed Mar 7 23:01:46 PST 2012 to Wed Mar 7 17:53:26 PST 2012.
Topic: crawldata
Live Web Proxy Crawls
web
eye 399,082
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBackMachine, captured by wwwb-gen1.us.archive.org:wbm from Tue Jan 3 11:14:32 PST 2012 to Tue Jan 3 04:36:49 PST 2012.
Topic: crawldata
Live Web Proxy Crawls
web
eye 396,498
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBackMachine, captured by wwwb-gen1.us.archive.org:wbm from Tue Jan 3 14:35:17 PST 2012 to Tue Jan 3 08:02:37 PST 2012.
Topic: crawldata
Archive Team: Dutch News Homepage Snapshots (2012-2016)
web
eye 389,801
favorite 0
comment 0
This item contains regular captures of the homepages of Dutch news websites, as screenshots and in WARC format. Websites: nos.nl, teletekst.nos.nl, rtlnieuws.nl, nu.nl, telegraaf.nl, metronieuws.nl, spitsnieuws.nl, volkskrant.nl, nrc.nl, trouw.nl, parool.nl, fd.nl, refdag.nl
Live Web Proxy Crawls
web
eye 386,112
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-live1.us.archive.org from 2012-11-22T06:44:42 UTC to 2012-11-22T18:14:08 UTC.
Topic: crawldata
Wide Crawl started January 2012
web
eye 385,434
favorite 0
comment 0
Internet Archive crawldata from Webwide Crawl, captured by crawl413.us.archive.org:wide from Thu Jan 5 16:52:35 PST 2012 to Thu Jan 5 10:17:00 PST 2012.
Topic: crawldata
Wide Crawl started January 2012
web
eye 369,871
favorite 0
comment 0
Internet Archive crawldata from Webwide Crawl, captured by crawl427.us.archive.org:wide from Tue Apr 17 00:58:46 PDT 2012 to Mon Apr 16 20:37:31 PDT 2012.
Topic: crawldata
Live Web Proxy Crawls
web
eye 369,520
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-live1.us.archive.org from 2012-12-13T10:53:21 UTC to 2012-12-13T16:53:32 UTC.
Topic: crawldata
Wide Crawl started September 2012
web
eye 362,772
favorite 0
comment 0
Internet Archive crawldata from Webwide Crawl, captured by crawl410.us.archive.org:wide from Wed Nov 7 02:28:32 PST 2012 to Tue Nov 6 20:50:40 PST 2012.
Topic: crawldata
Archive Team: Dutch News Homepage Snapshots (2012-2016)
web
eye 360,062
favorite 0
comment 0
This item contains regular captures of the homepages of Dutch news websites, as screenshots and in WARC format. Websites: nos.nl, teletekst.nos.nl, rtlnieuws.nl, nu.nl, telegraaf.nl, metronieuws.nl, spitsnieuws.nl, volkskrant.nl, nrc.nl, trouw.nl, parool.nl, fd.nl, refdag.nl
NLS_2012
web
eye 352,367
favorite 0
comment 0
Internet Archive crawldata uploaded by selenium-101.us.archive.org:NLS-CRAWL-004 from Sat Jun 16 16:29:25 PDT 2012 to Wed Dec 12 11:44:11 PST 2012.
Topic: crawldata
Live Web Proxy Crawls
web
eye 351,675
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-gen1.us.archive.org from 2012-06-26T02:05:01 UTC to 2012-06-26T10:40:25 UTC.
Topic: crawldata
Wide Crawl started January 2012
web
eye 349,006
favorite 0
comment 0
Internet Archive crawldata from Webwide Crawl, captured by crawl413.us.archive.org:wide from Thu Jan 5 15:53:45 PST 2012 to Thu Jan 5 08:51:54 PST 2012.
Topic: crawldata
Live Web Proxy Crawls
web
eye 347,603
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-live1.us.archive.org from 2012-12-31T05:25:29 UTC to 2013-01-01T02:09:59 UTC.
Topic: crawldata
google.co.jp
web
eye 347,005
favorite 0
comment 0
Source: google.co.jp
NLS_2012
web
eye 341,673
favorite 0
comment 0
Internet Archive crawldata uploaded by selenium-101.us.archive.org:NLS-CRAWL-004 from Wed Oct 31 04:28:48 PDT 2012 to Wed Dec 12 01:48:33 PST 2012.
Topic: crawldata
Live Web Proxy Crawls
web
eye 336,910
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-live0.us.archive.org from 2012-12-14T21:29:12 UTC to 2013-01-23T21:42:32 UTC.
Topic: crawldata
Live Web Proxy Crawls
web
eye 335,026
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-live0.us.archive.org from 2012-12-30T15:07:01 UTC to 2012-12-31T14:15:26 UTC.
Topic: crawldata
Live Web Proxy Crawls
web
eye 317,956
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBackMachine, captured by wwwb-gen1.us.archive.org:wbm from Mon Mar 5 18:11:24 PST 2012 to Mon Mar 5 12:34:16 PST 2012.
Topic: crawldata
Live Web Proxy Crawls
web
eye 311,297
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-live0.us.archive.org from 2012-12-18T16:45:10 UTC to 2012-12-19T09:20:10 UTC.
Topic: crawldata
Live Web Proxy Crawls
web
eye 308,420
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-live1.us.archive.org from 2012-12-14T17:34:12 UTC to 2012-12-15T10:27:19 UTC.
Topic: crawldata
Archive Team: Dutch News Homepage Snapshots (2012-2016)
web
eye 304,175
favorite 0
comment 0
This item contains regular captures of the homepages of Dutch news websites, as screenshots and in WARC format. Websites: nos.nl, teletekst.nos.nl, rtlnieuws.nl, nu.nl, telegraaf.nl, metronieuws.nl, spitsnieuws.nl, volkskrant.nl, nrc.nl, trouw.nl, parool.nl, fd.nl, refdag.nl
Wide Crawl started April 2012
web
eye 303,317
favorite 0
comment 0
Internet Archive crawldata from Webwide Crawl, captured by crawl422.us.archive.org:wide from Fri Jun 8 07:28:07 PDT 2012 to Fri Jun 8 02:31:40 PDT 2012.
Topic: crawldata
The Archive Team Just In Time Grabs
web
eye 300,594
favorite 1
comment 0
Live Web Proxy Crawls
web
eye 296,441
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-live0.us.archive.org from 2012-11-22T04:31:46 UTC to 2012-11-22T19:27:11 UTC.
Topic: crawldata
Wide Crawl started January 2012
web
eye 294,233
favorite 0
comment 0
Internet Archive crawldata from Webwide Crawl, captured by crawl413.us.archive.org:wide from Thu Jan 5 19:32:22 PST 2012 to Thu Jan 5 12:20:54 PST 2012.
Topic: crawldata
Wide Crawl started January 2012
web
eye 291,313
favorite 0
comment 0
Internet Archive crawldata from Webwide Crawl, captured by crawl336.us.archive.org:wide from Wed Jan 18 17:01:06 PST 2012 to Wed Jan 18 09:17:47 PST 2012.
Topic: crawldata
Live Web Proxy Crawls
web
eye 290,577
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBackMachine, captured by wwwb-gen1.us.archive.org:wbm from Tue Feb 28 04:16:59 PST 2012 to Tue Feb 28 03:44:00 PST 2012.
Topic: crawldata
Wide Crawl started April 2012
web
eye 290,535
favorite 0
comment 0
Internet Archive crawldata from Webwide Crawl, captured by crawl426.us.archive.org:wide from Sun May 13 07:09:39 PDT 2012 to Sun May 13 01:49:20 PDT 2012.
Topic: crawldata
Archive Team: Dutch News Homepage Snapshots (2012-2016)
web
eye 289,077
favorite 0
comment 0
This item contains regular captures of the homepages of Dutch news websites, as screenshots and in WARC format. Websites: nos.nl, teletekst.nos.nl, rtlnieuws.nl, nu.nl, telegraaf.nl, metronieuws.nl, spitsnieuws.nl, volkskrant.nl, nrc.nl, trouw.nl, parool.nl, fd.nl, refdag.nl
Wikipedia Outlinks February 2012
web
eye 287,004
favorite 0
comment 0
Internet Archive crawldata from Wikipedia outbound links, captured by crawl435.us.archive.org:wpo from Sat Feb 11 14:58:31 PST 2012 to Sat Feb 11 08:33:22 PST 2012.
Topic: crawldata
Live Web Proxy Crawls
web
eye 286,922
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-live1.us.archive.org from 2012-12-30T19:10:53 UTC to 2012-12-31T10:26:34 UTC.
Topic: crawldata
Live Web Proxy Crawls
web
eye 285,461
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBackMachine, captured by wwwb-gen1.us.archive.org:wbm from Sun Jan 1 09:53:25 PST 2012 to Sun Jan 1 03:42:17 PST 2012.
Topic: crawldata
Live Web Proxy Crawls
web
eye 285,025
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-gen1.us.archive.org from 2012-07-30T04:16:42 UTC to 2012-07-30T11:28:26 UTC.
Topic: crawldata
ask.com
web
eye 284,682
favorite 0
comment 0
Source: ask.com
Live Web Proxy Crawls
web
eye 283,526
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBackMachine, captured by wwwb-gen1.us.archive.org:wbm from Sun Feb 12 06:50:14 PST 2012 to Sun Feb 12 02:22:07 PST 2012.
Topic: crawldata
reddit.com
web
eye 283,022
favorite 0
comment 0
Source: reddit.com
reddit.com
web
eye 281,363
favorite 0
comment 0
Source: reddit.com
Wide Crawl started April 2012
web
eye 280,552
favorite 0
comment 0
Internet Archive crawldata from Webwide Crawl, captured by crawl413.us.archive.org:wide from Thu May 3 12:36:06 PDT 2012 to Thu May 3 07:30:25 PDT 2012.
Topic: crawldata
Live Web Proxy Crawls
web
eye 280,406
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-live1.us.archive.org from 2012-12-27T00:05:42 UTC to 2012-12-27T16:35:17 UTC.
Topic: crawldata
Live Web Proxy Crawls
web
eye 276,333
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-live1.us.archive.org from 2012-10-30T19:58:33 UTC to 2012-10-31T18:47:34 UTC.
Topic: crawldata
Live Web Proxy Crawls
web
eye 275,577
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-live1.us.archive.org from 2012-12-28T19:46:00 UTC to 2012-12-29T15:05:14 UTC.
Topic: crawldata
Archive Team: Dutch News Homepage Snapshots (2012-2016)
web
eye 274,523
favorite 0
comment 0
This item contains regular captures of the homepages of Dutch news websites, as screenshots and in WARC format. Websites: nos.nl, teletekst.nos.nl, rtlnieuws.nl, nu.nl, telegraaf.nl, metronieuws.nl, spitsnieuws.nl, volkskrant.nl, nrc.nl, trouw.nl, parool.nl, fd.nl, refdag.nl
ameblo.jp
web
eye 274,382
favorite 0
comment 0
Source: ameblo.jp
Live Web Proxy Crawls
web
eye 273,445
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-live1.us.archive.org from 2012-12-07T06:06:22 UTC to 2012-12-10T21:54:23 UTC.
Topic: crawldata
Live Web Proxy Crawls
web
eye 273,317
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-live0.us.archive.org from 2012-10-15T10:26:27 UTC to 2012-10-15T18:41:53 UTC.
Topic: crawldata
Live Web Proxy Crawls
web
eye 272,974
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-gen1.us.archive.org from 2012-07-27T10:56:04 UTC to 2012-07-27T17:15:08 UTC.
Topic: crawldata
Live Web Proxy Crawls
web
eye 269,954
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-live1.us.archive.org from 2012-12-20T18:25:28 UTC to 2012-12-21T16:41:39 UTC.
Topic: crawldata
NLS_2012
web
eye 268,230
favorite 0
comment 0
Internet Archive crawldata uploaded by selenium-101.us.archive.org:NLS-CRAWL-004 from Sun Jul 8 09:00:15 PDT 2012 to Wed Dec 12 01:48:28 PST 2012.
Topic: crawldata
Live Web Proxy Crawls
web
eye 267,138
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBackMachine, captured by wwwb-gen1.us.archive.org:wbm from Sat Mar 24 09:36:36 PDT 2012 to Sat Mar 24 08:42:19 PDT 2012.
Topic: crawldata
Live Web Proxy Crawls
web
eye 266,504
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-live0.us.archive.org from 2012-12-14T22:44:21 UTC to 2012-12-15T05:16:24 UTC.
Topic: crawldata
Live Web Proxy Crawls
web
eye 264,315
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-live1.us.archive.org from 2012-12-27T07:49:06 UTC to 2012-12-28T07:29:11 UTC.
Topic: crawldata
yahoo.co.jp
web
eye 263,644
favorite 0
comment 0
Source: yahoo.co.jp
Wide Crawl started September 2012
web
eye 263,639
favorite 0
comment 0
Internet Archive crawldata from Webwide Crawl, captured by crawl425.us.archive.org:wide from Wed Nov 14 17:39:11 PST 2012 to Wed Nov 14 10:55:11 PST 2012.
Topic: crawldata
Live Web Proxy Crawls
web
eye 263,444
favorite 0
comment 0
Internet Archive Liveweb Capture from WayBack Machine, captured by wwwb-live0.us.archive.org from 2012-12-28T22:26:32 UTC to 2012-12-29T15:19:50 UTC.
Topic: crawldata